Background

What

This document contains the data analysis steps performed, as well as the presentation for the Google Data Analytics course.
Project chosen: Capstone case study #1:
How Does Cyclistic Bike-Share Company Navigate Speedy Success?

Scenario

Cyclistic is a bike-share company in Chicago. The company marketing director believes the company’s future success depends on maximizing the number of annual memberships. Therefore, the marketing analysts team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, the team will design a new marketing strategy to convert casual riders into annual members. Cyclistic executives must approve the recommendations, so recommendations must be backed up with compelling data insights and professional data visualizations - all of which shall be presented here.

About the company

In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geo-tracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime.

Until now, Cyclistic’s marketing strategy relied on building general awareness and appealing to broad consumer segments. One approach that helped make these things possible was the flexibility of its pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as casual riders. Customers who purchase annual memberships are Cyclistic members.

Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders. Although the pricing flexibility helps Cyclistic attract more customers, the Marketing Director believes that maximizing the number of annual members will be key to future growth. Rather than creating a marketing campaign that targets all-new customers, the director believes there is a very good chance to convert casual riders into members. The director notes that casual riders are already aware of the Cyclistic program and have chosen Cyclistic for their mobility needs.

The director has set a clear goal: Design marketing strategies aimed at converting casual riders into annual members. In order to do that, however, the marketing analyst team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect their marketing tactics. The Marketing Director and her team are interested in analyzing the Cyclistic historical bike trip data to identify trends.

Analysis Process

ASK

Problem Definition

Cyclistic's profitability is limited by having too few annual members. The company currently has enough riders in total, but a significant percentage of them are casual riders.

The Why

  • Why are casual riders not converting to annual membership?

  • How do casual riders differ from member riders?

Business Task

  • Increase the number of members by converting casual riders to members.

  • Extract insights from past data on how casual riders use the shared bikes differently from member riders, to suggest strategies that may help convert casual riders into members.

Stakeholders

  • The Director of Marketing, who is responsible for the design and implementation of new initiatives and marketing campaigns, and thus must be presented with data-backed insights and recommendations.
  • The Marketing Analytics Team, which collaborates on all stages of the data analytics process. It provides information and critique, and needs to be kept informed throughout the process.
  • The Executive Team, which must approve the recommendations and the proposed marketing program, and must be presented with a detailed analysis covering all fundamentals.

Note - Recommendations are provided. However, designing a marketing campaign to realize them is beyond the scope of this analysis.

Preparation

Data Source

The datasets used for the analysis are provided by Cyclistic. They are a collection of past observations of all rides over a little more than a year. The ride datasets include attributes such as:

  • unique ride ID
  • start and end locations including plain-English names and coordinates.
  • date and time.
  • the type of rider whether member or casual.

The station dataset includes station information such as:

  • unique station ID
  • plain location as well as coordinates,
  • capacity

Ride observations DO NOT include any information about the riders themselves.

Data Organization

All datasets are provided in CSV format. The datasets containing the ride information are organized in long format, where each unique ride occupies a single row that comprises all the ride observation attributes mentioned above. The dataset containing station information is also organized in long format, where a single row holds all the attributes of each unique station.

Data Utilized

Twelve datasets covering 12 consecutive months, from December 2020 through November 2021, plus a station information dataset, were retrieved from https://divvy-tripdata.s3.amazonaws.com/index.html on December 27th, 2022.
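
The 12-month window can be enumerated programmatically to check that no monthly file is missing. The sketch below is only an illustration: the file-name pattern `<YYYYMM>-divvy-tripdata.csv` is an assumption based on the `tripdata.csv` suffix used later when loading the files, and should be adjusted to the actual file names.

```r
# Build the expected monthly file names for December 2020 through November 2021.
# ASSUMPTION: files are named "<YYYYMM>-divvy-tripdata.csv" (hypothetical pattern).
first_month <- as.Date("2020-12-01")
months <- seq(first_month, by = "month", length.out = 12)
monthly_files <- paste0(format(months, "%Y%m"), "-divvy-tripdata.csv")

# Show the first and last expected file names
print(monthly_files[c(1, 12)])
```

Comparing `monthly_files` against the contents of the download directory (for example with `setdiff()`) would reveal any month that was not downloaded.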

Data Location

All datasets used were downloaded from the provided URL to the analyst's local machine.

Data Credibility, Licensing, Security, Privacy and Accessibility

  • The data are provided by Google Inc. via the official Coursera Inc. site and the pages dedicated to the Google Data Analytics course.

  • URL for the data is provided to course attendees who must log into the Coursera site and are verified by Coursera.

  • All data used for the analysis were downloaded to the analyst's local machine in a private office. The machine is not connected to any public network, requires log-in credentials, and is protected by a real-time firewall and virus protection. The data are thus kept secure under the conditions described above.

  • Datasets do not contain rider personal information of any kind or rider identifiers such as age, sex and address. Full anonymity is maintained.

  • Licensing is provided from Coursera/Google and the source of the data is clearly indicated throughout the presentation.

Sufficiency of Data and Problems with Data

  • Data Time Period is reasonably sufficient, since 12 months cover the different weather and holiday seasons of the year. I must note, though, that 2021 was not a typical year due to the Covid-19 pandemic.
  • Ride Date Elements are somewhat sufficient. Date-time is provided for all rides, so indicators such as ride duration or peak times per rider type can be extracted. However, some records have a start date later than the end date. These are omitted because there is not enough information to correct them, for example by equalizing the month of start and end or by swapping the two. Since such records are negligible in number compared with the valid ones, omitting them appears to be the best choice.
  • Station Location is provided for most rides, so indicators such as preferred location per rider type can be extracted. However, some ride observations are missing the start and/or end station name (i.e. the plain location). Unfortunately, the single dataset containing the station information does not cover all the latitude/longitude pairs that appear in the ride observation records. Hence filling in this information from the station dataset is not possible.
  • Missing Information that might shed more light on the differences between rider types, such as:
      • unique rider ID - helps to identify returning riders.
      • rider age group - gives hints about ride patterns among, for example, retirees vs workers vs students.
      • city of residency - allows distinguishing between residents, who are more likely to be motivated to purchase a membership, and temporary visitors.
  • Furthermore, since we know nothing about the riders, it is impossible to ascertain whether the data are biased with respect to the population.

Preparation Steps

Installing useful packages needed for the analysis process

  • tidyverse - adds more on top of basic data manipulation.
  • here - better path management for finding files.
  • magrittr - pipes for better code readability.
  • janitor - extra functionality for cleaning data.
  • plyr - splitting and merging large data.
  • lubridate - extra functionality for handling date and time.
  • ggeasy - additional plot formatting for ggplot2.
  • ggsci - additional color palettes.
  • leaflet - geo mapping.
  • geosphere - in case geo calculations are required.
  • htmltools - handling html widgets such as plots and tables.
  • plotly - for making plots interactive.
  • DT - for making tables interactive.
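
The installation code itself is not shown in this report. A minimal sketch of one way to check which of the listed packages still need installing is given below; the helper name `missing_packages` is hypothetical, and the actual `install.packages()` call is left commented out so the block has no side effects.

```r
# Packages listed above
required_packages <- c("tidyverse", "here", "magrittr", "janitor", "plyr",
                       "lubridate", "ggeasy", "ggsci", "leaflet", "geosphere",
                       "htmltools", "plotly", "DT")

# Hypothetical helper: return the subset of `pkgs` that is not installed yet
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(required_packages)
if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
}
```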

 

Loading the ride observation datasets for 12 consecutive months and creating a single frame covering a complete year

# Create year ride observation frame
####################################
# Load rides data from the bicycle trips CSV files
# list.files() returns the full paths of all matching files in the CSV sub-directory;
# `pattern` is a regular expression, so the file-name suffix is matched explicitly
all_rides <- list.files(path = "./CSV/Ride Data", pattern = "tripdata\\.csv$", full.names = TRUE) %>%
  # repeatedly apply read_csv to all files
  lapply(read_csv) %>%
  # Combine data sets into one data set
  bind_rows()
# Let's take a look at a few rows of data, the set size and the column headers

glimpse(all_rides)
## Rows: 5,479,096
## Columns: 13
## $ ride_id            <chr> "70B6A9A437D4C30D", "158A465D4E74C54A", "5262016E0F~
## $ rideable_type      <chr> "classic_bike", "electric_bike", "electric_bike", "~
## $ started_at         <dttm> 2020-12-27 12:44:29, 2020-12-18 17:37:15, 2020-12-~
## $ ended_at           <dttm> 2020-12-27 12:55:06, 2020-12-18 17:44:19, 2020-12-~
## $ start_station_name <chr> "Aberdeen St & Jackson Blvd", NA, NA, NA, NA, NA, N~
## $ start_station_id   <chr> "13157", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ end_station_name   <chr> "Desplaines St & Kinzie St", NA, NA, NA, NA, NA, NA~
## $ end_station_id     <chr> "TA1306000003", NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ start_lat          <dbl> 41.87773, 41.93000, 41.91000, 41.92000, 41.80000, 4~
## $ start_lng          <dbl> -87.65479, -87.70000, -87.69000, -87.70000, -87.590~
## $ end_lat            <dbl> 41.88872, 41.91000, 41.93000, 41.91000, 41.80000, 4~
## $ end_lng            <dbl> -87.64445, -87.70000, -87.70000, -87.70000, -87.590~
## $ member_casual      <chr> "member", "member", "member", "member", "member", "~

 

Creating station info frame

  • Loading the Stations dataset.
  • Loading the 2020 Q1 rides dataset.
  • When available, extracting additional stations - not provided in the Stations dataset - from the ride observation datasets.
  • Combining all stations into one Available Stations info frame.
# Create a station info frame from all available data, for possible later use
###########################################################################
# Load stations info from the stations CSV (single) file
available_stations <- read_csv("./CSV/Station Data/Divvy_Stations_2014-Q3Q4.csv") %>% 
  select("id", "name", "latitude", "longitude")

# Load the quarter data, extract stations and combine with the station info so far
Divvy_Trips_2020_Q1 <- read_csv("./CSV/Station Data/Divvy_Trips_2020_Q1.csv")

# extract start station info
temp <- Divvy_Trips_2020_Q1 %>% 
  select(start_station_id, start_station_name, start_lat, start_lng) %>% 
  distinct(start_station_name, .keep_all= TRUE) %>% 
  setNames(c("id", "name", "latitude", "longitude")) %>% 
  na.omit()

# combine
available_stations <- available_stations %>% rbind(temp) %>% 
  distinct(name, .keep_all= TRUE)

# extract end station info
temp <- Divvy_Trips_2020_Q1 %>% 
  select(end_station_id, end_station_name, end_lat, end_lng) %>% 
  distinct(end_station_name, .keep_all= TRUE) %>% 
  setNames(c("id", "name", "latitude", "longitude")) %>% 
  na.omit()

# combine
available_stations <- available_stations %>% rbind(temp) %>% 
  distinct(name, .keep_all= TRUE)

# extract start station info from all rides frame
temp <- all_rides %>% 
  select(start_station_id, start_station_name, start_lat, start_lng) %>% 
  distinct(start_station_name, .keep_all= TRUE) %>% 
  setNames(c("id", "name", "latitude", "longitude")) %>% 
  na.omit()

# combine
available_stations <- available_stations %>% rbind(temp) %>% 
  distinct(name, .keep_all= TRUE)

# extract end station info from all rides frame
temp <- all_rides %>% 
  select(end_station_id, end_station_name, end_lat, end_lng) %>% 
  distinct(end_station_name, .keep_all= TRUE) %>% 
  setNames(c("id", "name", "latitude", "longitude")) %>% 
  na.omit()

# combine into the "final" station info frame
#############################################
available_stations <- available_stations %>% rbind(temp) %>% 
  distinct(name, .keep_all= TRUE) %>% 
  na.omit() %>% 
  arrange(name)

 

Result - 904 Distinct Stations

 

I see that Cyclistic has grown beyond the 692 stations specified when the project was created. There may also be more stations that are not named in any dataset and fall into the NA category.
I considered filling in missing station names via the following process:
Calculating the distance between each coordinate pair lacking a station name and the known stations, and selecting the closest station accordingly.
However, the coordinates provided in observations that lack station names are not accurate enough (2 decimal places instead of 6). Thus a station cannot be unambiguously identified.
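
To illustrate why two decimal places are too coarse, the sketch below computes how far apart two points are when they differ by a single 0.01-degree rounding step near downtown Chicago. It uses a plain base-R haversine function written for this illustration (the function name, the mean Earth radius of 6,371 km and the sample coordinates are assumptions, not part of the original analysis).

```r
# Great-circle (haversine) distance in metres between points in decimal degrees.
# ASSUMPTION: spherical Earth with mean radius 6,371,000 m.
haversine_m <- function(lat1, lng1, lat2, lng2, radius_m = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlng <- (lng2 - lng1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlng / 2)^2
  2 * radius_m * asin(sqrt(pmin(1, a)))
}

# Size of one 0.01-degree step in latitude and in longitude (illustrative point)
step_lat_m <- haversine_m(41.88, -87.63, 41.89, -87.63)
step_lng_m <- haversine_m(41.88, -87.63, 41.88, -87.64)
print(round(c(latitude_step_m = step_lat_m, longitude_step_m = step_lng_m)))
```

Each rounding step spans on the order of a kilometre, so several stations can fall inside the same rounded coordinate and the nearest-station match is ambiguous.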
 
Available Stations Information Table
 

Cleaning

Replacing empty alphanumeric fields with NA

  • Removing leading and trailing spaces from alphanumeric fields.
  • Replacing the empty values with NA.
  • Recording how many NA values the essential variables have.
cleaned_rides <- all_rides %>%  
  mutate(across(where(is.character), str_trim)) %>%
  mutate(across(where(is.character), ~na_if(., "")))

Recording how many NA values I have for essential attributes

num_empty_start_station_name <- sum(is.na(cleaned_rides$start_station_name))
num_empty_start_station_name_and_id <- sum(is.na(cleaned_rides$start_station_name) & 
                                             is.na(cleaned_rides$start_station_id)    )
num_empty_started_at <- sum(is.na(cleaned_rides$started_at))
num_empty_ended_at <- sum(is.na(cleaned_rides$ended_at))
  • Number of NA start station names: 651445
  • Number of NA start station names and IDs: 651442
  • Number of NA trip starting coordinates: 0
  • Number of NA trip ending coordinates: 0

Observations that lack a station name almost always lack a station ID as well; only 3 have an ID without a name. This difference is negligible.

I don’t have missing coordinates so trips can be mapped if necessary.
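
The counting code shown above inspects the station-name and date-time columns. A minimal sketch of the same kind of check applied directly to the four coordinate columns is given below; the toy data frame is invented for illustration only and deliberately contains missing values.

```r
# Toy data (illustrative values only, not taken from the real datasets)
rides <- data.frame(
  start_lat = c(41.88, 41.93, NA),
  start_lng = c(-87.65, -87.70, NA),
  end_lat   = c(41.89, NA, 41.80),
  end_lng   = c(-87.64, NA, -87.59)
)

# Count NA values per coordinate column
coordinate_columns <- c("start_lat", "start_lng", "end_lat", "end_lng")
na_per_column <- colSums(is.na(rides[, coordinate_columns]))
print(na_per_column)
```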

Removing duplicate ride observations

Each ride has a unique ID, so I can check for duplication on this attribute.

cleaned_rides <- distinct(cleaned_rides, ride_id, .keep_all= TRUE)

num_cleaned_rides <- nrow(cleaned_rides)

Filter out records where the trip end date-time is not later than the trip start date-time, or where either is NA

# Let's count the date-time error cases
num_date_time_errors <- sum(is.na(cleaned_rides$started_at) | 
                            is.na(cleaned_rides$ended_at) | 
                            (cleaned_rides$ended_at <= cleaned_rides$started_at))

Most records have a correct start and end date-time. For 0.02% of them that is not the case. In theory, I could have swapped the start and end date-time when the start is later than the end. However, I decided not to, since I do not know the source of the error (in a real-life situation I would have checked with the relevant people). I therefore eliminated these records from the analysis; luckily, their number is negligible.

# Now filter out those records with date-time errors
cleaned_rides <- filter(cleaned_rides, !is.na(started_at) & !is.na(ended_at) & 
                                   (ended_at > started_at))
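
The effect of this rule can be checked on a small example. The sketch below uses invented rides (one valid, one ending before it starts, one with zero duration, one with a missing start) and base R only; it is an illustration of the condition, not the original pipeline.

```r
# Toy rides (illustrative values only)
rides <- data.frame(
  ride_id    = c("A", "B", "C", "D"),
  started_at = as.POSIXct(c("2021-06-01 08:00:00", "2021-06-01 09:00:00",
                            "2021-06-01 10:00:00", NA), tz = "UTC"),
  ended_at   = as.POSIXct(c("2021-06-01 08:15:00", "2021-06-01 08:50:00",
                            "2021-06-01 10:00:00", "2021-06-01 11:00:00"), tz = "UTC")
)

# A record is invalid when a timestamp is missing or the ride does not end after it starts
is_invalid <- is.na(rides$started_at) | is.na(rides$ended_at) |
  rides$ended_at <= rides$started_at

# Share of invalid records (in percent) and the rides that survive the filter
invalid_share_pct <- 100 * sum(is_invalid) / nrow(rides)
valid_rides <- rides[!is_invalid, ]
print(valid_rides$ride_id)
```

Note that zero-duration rides (ride C) are treated as errors too, because the condition uses `<=` rather than `<`.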

Sorting the cleaned ride observations by trip start date-time, from earliest to latest

I verified that there are no duplicate ride records, that empty fields were handled, and that records with date-time errors were removed.

cleaned_rides <- arrange(cleaned_rides, started_at)

num_cleaned_rides <- nrow(cleaned_rides)

5,478,022 Ride observation records can be used for analysis.

Note that 651,379 ride observation records do not have trip start station name (12%).

 

Rides Available Observations Table

 

Analyzing

I will try to extract long term differences between member and casual riders. For example:

For this analysis, I use the trip start location whenever I group by location.

Calculate useful values and store in new columns

  • Trip Start Month.
  • Trip Start Day-of-the-Week.
  • Trip Start Hour.
  • Trip Duration in Minutes.
# Note: data.table has its own wday function (R’s global scoping) 
# so I work around it by prefixing the call to wday.
cleaned_rides <- cleaned_rides %>%
  mutate(started_at_date = lubridate::as_date(started_at),
         started_at_month = lubridate::month(started_at, label = TRUE, abbr = FALSE),
         started_at_week_day = lubridate::wday(started_at, label = TRUE, abbr = FALSE), 
         #started_at_hour = format(started_at, format = "%I %p"),  # as a character object
         started_at_hour = hms::as_hms(started_at),  # as time object
         trip_duration_minutes = round(difftime(ended_at, started_at, units = "mins"), 2)
  )
glimpse(cleaned_rides) 
## Rows: 5,478,022
## Columns: 18
## $ ride_id               <chr> "1C46BF5EB60CC524", "1405BFC02FDB5190", "892ECFA~
## $ rideable_type         <chr> "electric_bike", "electric_bike", "docked_bike",~
## $ started_at            <dttm> 2020-12-01 00:01:15, 2020-12-01 00:01:27, 2020-~
## $ ended_at              <dttm> 2020-12-01 00:06:53, 2020-12-01 00:06:33, 2020-~
## $ start_station_name    <chr> NA, NA, "Larrabee St & Armitage Ave", "Wabash Av~
## $ start_station_id      <chr> NA, NA, "TA1309000006", "KA1503000015", "TA13070~
## $ end_station_name      <chr> NA, "Wentworth Ave & 63rd St", "Sedgwick St & We~
## $ end_station_id        <chr> NA, "KA1503000025", "13191", "13158", "13108", "~
## $ start_lat             <dbl> 41.79000, 41.78000, 41.91808, 41.87947, 41.96797~
## $ start_lng             <dbl> -87.59000, -87.62000, -87.64375, -87.62569, -87.~
## $ end_lat               <dbl> 41.80000, 41.78010, 41.92217, 41.87764, 41.97382~
## $ end_lng               <dbl> -87.60000, -87.62971, -87.63889, -87.64962, -87.~
## $ member_casual         <chr> "member", "casual", "member", "member", "member"~
## $ started_at_date       <date> 2020-12-01, 2020-12-01, 2020-12-01, 2020-12-01,~
## $ started_at_month      <ord> December, December, December, December, December~
## $ started_at_week_day   <ord> Tuesday, Tuesday, Tuesday, Tuesday, Tuesday, Tue~
## $ started_at_hour       <time> 00:01:15, 00:01:27, 00:07:08, 00:11:37, 00:21:2~
## $ trip_duration_minutes <drtn> 5.63 mins, 5.10 mins, 2.90 mins, 9.90 mins, 6.6~
Rides Available Observations Table

 

## Rows: 3
## Columns: 5
## $ `Total Rides`                 <int> 2989093, 2488929, 5478022
## $ `Total Rides Duration`        <drtn> 41132838 mins, 80098827 mins, 121231665 ~
## $ `Average Ride Duration (min)` <drtn> 13.76 mins, 32.18 mins, 22.13 mins
## $ `Ride Duration STD`           <dbl> 27.85, 263.26, 178.87
## $ `Ride Duration CV`            <dbl> 2.02, 8.18, 8.08
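
The code that produced this summary is not shown. The reported figures are consistent with the coefficient of variation (CV) being the standard deviation divided by the mean (for example 27.85 / 13.76 ≈ 2.02). A minimal base-R sketch of such a summary, on invented durations and with hypothetical column names, could look like this:

```r
# Toy rides (illustrative durations in minutes)
rides <- data.frame(
  member_casual = c("member", "member", "member", "casual", "casual", "casual"),
  trip_duration_minutes = c(10, 12, 14, 20, 30, 70)
)

# Summary statistics for one vector of durations; CV = standard deviation / mean
summarise_durations <- function(minutes) {
  c(total_rides   = length(minutes),
    total_minutes = sum(minutes),
    mean_minutes  = mean(minutes),
    sd_minutes    = sd(minutes),
    cv            = sd(minutes) / mean(minutes))
}

# One row per rider type, plus a row for all rides together
per_type <- t(sapply(split(rides$trip_duration_minutes, rides$member_casual),
                     summarise_durations))
summary_table <- rbind(per_type, all = summarise_durations(rides$trip_duration_minutes))
print(round(summary_table, 2))
```

A CV well above 1, as reported for casual rides, indicates durations that are spread very widely around the mean.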

Prepare trip start station statistics and store them

NOTE
All 651,379 rides with an unknown start station, out of the total 5,478,022 (12%), are grouped into an NA category.
For rides lacking a start station name, I tried to find the closest known station by calculating distances from their start coordinates to each of the known stations. However, this proved impossible due to:
Reduced accuracy of the coordinates supplied for rides without a station name.
For this analysis I assume that a station pattern, if one exists, can be extracted from the known stations only.

temp1 <- cleaned_rides %>% 
  distinct(start_station_name, .keep_all= TRUE) %>% 
  select(start_station_name, start_lat, start_lng) 

# I use suppressWarnings on the min and max functions so as not to get the -Inf warning for 0 rides
temp2 <- cleaned_rides %>%
  dplyr::group_by(start_station_name) %>%
  dplyr::summarise(total_rides = n(), 
                   total_member_rides = sum(member_casual == "member"),
                   total_casual_rides = sum(member_casual == "casual"),
                   mean_duration_minutes_member = mean(trip_duration_minutes[member_casual == "member"], na.rm = TRUE), 
                   mean_duration_minutes_casual = mean(trip_duration_minutes[member_casual == "casual"], na.rm = TRUE),
                   min_duration_minutes_member = suppressWarnings(min(trip_duration_minutes[member_casual == "member"], na.rm = TRUE)), 
                   min_duration_minutes_casual = suppressWarnings(min(trip_duration_minutes[member_casual == "casual"], na.rm = TRUE)),
                   max_duration_minutes_member = suppressWarnings(max(trip_duration_minutes[member_casual == "member"], na.rm = TRUE)), 
                   max_duration_minutes_casual = suppressWarnings(max(trip_duration_minutes[member_casual == "casual"], na.rm = TRUE))
                   ) %>% 
  mutate_all(function(x) ifelse(is.infinite(x), 0, x)) %>%
  mutate_all(function(x) ifelse(is.nan(x), 0, x))

station_coordinates <- merge(temp1, temp2, by = "start_station_name", sort = TRUE) %>% 
  arrange(start_station_name)
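
The replacement of infinite and NaN values above is needed because of how base R treats empty vectors: a station with no rides for one rider type yields an empty subset. The sketch below demonstrates this behaviour and shows one possible alternative, a small wrapper (the name `safe_stat` is hypothetical) that returns 0 for empty input.

```r
no_rides <- numeric(0)

# Base R behaviour on an empty vector
empty_min  <- suppressWarnings(min(no_rides))   # Inf, with a warning if not suppressed
empty_max  <- suppressWarnings(max(no_rides))   # -Inf, with a warning if not suppressed
empty_mean <- mean(no_rides)                    # NaN

# Hypothetical helper: apply `stat` after dropping NAs, or return 0 when nothing is left
safe_stat <- function(values, stat) {
  values <- values[!is.na(values)]
  if (length(values) == 0) return(0)
  stat(values)
}

print(c(safe_stat(no_rides, min), safe_stat(c(3, NA, 1, 2), min)))
```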

 

Available Stations Ride Statistic Table

 

Create “pivot-like” tables holding specific statistics

Daily ride statistics per rider type

daily_member_type_rides <- cleaned_rides %>% 
  dplyr::group_by(started_at_date) %>%
  dplyr::summarise(total_rides = n(), 
                   total_member_rides = sum(member_casual == "member"),
                   total_casual_rides = sum(member_casual == "casual"),
                   mean_duration_minutes_member = mean(trip_duration_minutes[member_casual == "member"], na.rm = TRUE), 
                   mean_duration_minutes_casual = mean(trip_duration_minutes[member_casual == "casual"], na.rm = TRUE),
                   min_duration_minutes_member = suppressWarnings(min(trip_duration_minutes[member_casual == "member"], na.rm = TRUE)), 
                   min_duration_minutes_casual = suppressWarnings(min(trip_duration_minutes[member_casual == "casual"], na.rm = TRUE)),
                   max_duration_minutes_member = suppressWarnings(max(trip_duration_minutes[member_casual == "member"], na.rm = TRUE)), 
                   max_duration_minutes_casual = suppressWarnings(max(trip_duration_minutes[member_casual == "casual"], na.rm = TRUE))
  ) %>% 
  mutate_all(function(x) ifelse(is.infinite(x), 0, x)) %>%
  mutate_all(function(x) ifelse(is.nan(x), 0, x)) %>% 
  # ifelse() inside mutate_all() drops the Date class, leaving the column as a double, so
  # let's reformat it as date
  mutate(started_at_date = lubridate::as_date(started_at_date))
 
Daily Ride Statistic per Rider Type Table

 

Month and Day-of-the-Week ride statistics per rider type

month_day_member_type_rides <- ddply(cleaned_rides, c("started_at_month", "started_at_week_day", "member_casual"), summarise,
               total_rides      = length(trip_duration_minutes),
               avg_ride_minutes = round(mean(trip_duration_minutes), 2),
               min_ride_minutes = min(trip_duration_minutes),
               max_ride_minutes = max(trip_duration_minutes))
 
Month-Day Ride Statistic per Rider Type Table

 

Station usage per member type

station_member_type_rides <- cleaned_rides %>% 
  dplyr::group_by(start_station_name, member_casual) %>% 
  dplyr::summarise(total_rides = n())
 
Trip Start Station Usage per Rider Type Table

 

Bicycle type usage per member type

# Note that I must prefix summarise with dplyr:: since plyr is loaded after dplyr and has functions with the same names
ridetype_member_type_rides <- cleaned_rides %>% 
  dplyr::group_by(rideable_type, member_casual) %>% 
  dplyr::summarise(total_rides = n())
 
Bicycle Type Usage per Rider Type Table

 

Visualize

Summary

Available Ride Observation Summary Table

From the first summary I can already observe that:

Preliminary Hypothesis

Repeated rides of similar duration may indicate business related purpose while varied and typically longer duration rides may indicate leisure purpose.
Where ride purpose categories are defined as:

My hypothesis is that a larger share of member rides are for business purposes, while the opposite is true for casual rides.

Furthermore, since the number of casual rides (2,488,929) is comparable to the number of member rides (2,989,093), I can confirm that converting casual riders to members does indeed make sense, with the following caveat: the datasets include only a ride ID, not a rider ID. I therefore do not know how many of the riders, especially the casual ones, are repeat riders, nor where they reside (which, for out-of-town visitors, could be far from the starting station).

Examine 24-hours cycle differences between member riders and casual ones

 

Examining the 24-hours number of rides cycle reveals:

The number of member rides rises much more sharply during rush hours, when people mostly ride to or from work or school. This strengthens the hypothesis that leisure rides make up a larger proportion of casual rides and business rides a larger proportion of member rides.
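
The plotting code for the 24-hour cycle is not shown. As a rough sketch of the underlying aggregation, rides can be counted per start hour and rider type as below; the toy rides and the `start_hour` column name are invented for illustration.

```r
# Toy rides (illustrative values only)
rides <- data.frame(
  started_at = as.POSIXct(c("2021-06-01 08:05:00", "2021-06-01 08:40:00",
                            "2021-06-01 17:10:00", "2021-06-01 13:30:00",
                            "2021-06-01 14:15:00", "2021-06-01 08:20:00"), tz = "UTC"),
  member_casual = c("member", "member", "member", "casual", "casual", "casual")
)

# Hour of day (0-23) as a factor, so hours without rides still appear with a zero count
rides$start_hour <- factor(as.integer(format(rides$started_at, "%H")), levels = 0:23)

# Rides per hour and rider type: 24 rows, one column per rider type
rides_per_hour <- table(hour = rides$start_hour, rider = rides$member_casual)
print(rides_per_hour[c("8", "13", "17"), ])
```

Plotting each column of `rides_per_hour` against the hour makes rush-hour peaks directly comparable between the two groups.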

Monthly differences between member riders and casual ones - is there a pattern?

 

Day-of-the-Week differences between member riders and casual ones. Is there a daily pattern?

 

 

Is there any difference between casual and member riders in utilization of bicycle types?

 

From the above chart I can discern the following:

  • Member riders use classic bikes 22% more than casual riders.
  • There is no significant difference in electric bicycle utilization between member and casual riders.
  • A relatively insignificant number of riders use docked bicycles, though casual riders use them much more than members do.

This provides further support for my hypothesis. Since member rides are more often for business purposes and are much shorter on average, repeated and “mandatory”, member riders probably use whatever bicycle is available, even if it is not electric.
I do not have information on how a docked bicycle differs from a classic or electric bicycle, nor on how the type affects the membership plan, but its usage is negligible.
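
One way to compare bicycle-type usage independently of the different group sizes is to look at each type's share within a rider group. The sketch below shows this on invented rides; it is an illustration of the calculation, not the computation behind the figures quoted above.

```r
# Toy rides (illustrative values only)
rides <- data.frame(
  rideable_type = c("classic_bike", "classic_bike", "classic_bike", "electric_bike",
                    "classic_bike", "electric_bike", "docked_bike", "electric_bike"),
  member_casual = c(rep("member", 4), rep("casual", 4))
)

# Ride counts per bicycle type (rows) and rider type (columns)
counts <- table(rides$rideable_type, rides$member_casual)

# Share of each bicycle type within a rider type: every column sums to 1
share_within_rider_type <- prop.table(counts, margin = 2)
print(round(100 * share_within_rider_type, 1))
```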

Is there a pattern difference for station usage between the two groups?

I will use the ride-start stations, excluding the ones lacking names (or addresses), since coordinates without a station name cannot be grouped and thus are not useful here.

 

From the stations-rides map I can observe the following:

Stations for which the number of casual rides is significantly larger than the number of member rides are located in areas of leisure such as along walking paths near the lake; near parks, museums and other such related destinations. Stations for which the opposite is the case are in areas of commerce such as near major transportation centers (probably allowing use of buses/trains to reach farther work places); near business buildings along the river, hospitals, universities and similar. The following shows several examples:

 

The relationship between points of interest and rider group supports my hypothesis that leisure riding is more dominant in the casual group, while business riding is more dominant in the member group.
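
The station comparison described above can be approximated numerically by the share of casual rides at each start station. The sketch below uses invented station names and rides and drops rides without a station name, as in the analysis; it is only an illustration of the ranking idea.

```r
# Toy rides; station names are hypothetical
rides <- data.frame(
  start_station_name = c("Lakefront A", "Lakefront A", "Lakefront A", "Lakefront A",
                         "Transit Hub B", "Transit Hub B", "Transit Hub B", NA),
  member_casual = c("casual", "casual", "casual", "member",
                    "member", "member", "casual", "casual")
)

# Keep only rides with a known start station
named <- rides[!is.na(rides$start_station_name), ]

# Rides per station and rider type, then the casual share per station
counts <- table(named$start_station_name, named$member_casual)
casual_share <- sort(counts[, "casual"] / rowSums(counts), decreasing = TRUE)
print(round(casual_share, 2))
```

Stations at the top of `casual_share` are dominated by casual rides, and those at the bottom by member rides; joining the result with station coordinates would allow the same split to be drawn on a map.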

I do not have information regarding the current membership pricing - i.e. whether tiers are based on frequency, length of rides and so on. Furthermore, as mentioned previously I do not know how many riders of either group - especially the casual one - are repeated customers. However I can provide some recommendations.

Share

I shared the visuals and my hypothesis regarding the nature of the rides with the stakeholders.
I reiterated:

The various graphs support my hypothesis and I provided my recommendations accordingly.

Act

Recommendations

Additional studies should be performed to extract missing details that may shed more light on why so many riders are casual and do not purchase annual memberships.

  • Conduct surveys collecting information about features riders - especially casual - like and want to see added.

  • Collect rider identification - while maintaining anonymity - including riding purpose, age range, visitor/resident status and so on. This information, in conjunction with the ride information, will help to identify trends and causality better.

Features to be implemented as part of the strategy for converting casual riders to membership riders.

  • Tailor membership based on ride length tiers for people who ride less frequently but for longer duration.

  • Add stations near leisure points of interest, sport related locations, visual art, performing art, libraries and such.

  • Develop membership-only apps providing benefits for various leisure activities such as fitness tracker when riding for exercise.

  • As part of membership - provide notifications about performances, lectures, sales and other attractions happening near stations.

  • Form business relationships with various leisure venues requiring tickets to attend events so that discounted tickets (and/or reservation) can be offered as part of membership.

 

Thank You!